rgdal and sfThere are loads of spatial mapping/plotting packages in R. The main two ways to read in spatial data use the rgdal package, and the sf package. Let’s look at how to load/plot line and polygon data.
Let’s load packages first:
library(rgdal); # spatial/shp reading
library(dplyr); # wrangling data/plotting
library(readr);
library(viridis); # nice color palette
library(sf); # newer "simple features" spatial package
library(USAboundaries); # state/county data
#library(Imap); # nice mapping/color functions col.map
library(ggrepel) # for labelingFor this example, let’s use some Hydroshed Data.
I’ve downloaded rivers for CA and OR and put them here on github. Download the zipped file and unzip it in a data folder. We’re going to use shapefiles for the remainder of this example.
rgdalLet’s load a polyline or line shapefile of rivers of California and Oregon. The result is a SpatialLinesDataFrame in your R environment.
# we can use ogrInfo to see CRS, attributes, etc.
ogrInfo(dsn="./data", layer="rivs_CA_OR_hydroshed") # see shapefile info## Source: "/Users/ryanpeek/Documents/github/teaching/mapping-in-R-workshop/data", layer: "rivs_CA_OR_hydroshed"
## Driver: ESRI Shapefile; number of rows: 21976
## Feature type: wkbLineString with 2 dimensions
## Extent: (-124.5429 32.58542) - (-114.1315 46.26042)
## CRS: +proj=longlat +datum=WGS84 +no_defs
## LDID: 87
## Number of fields: 26
## name type length typeName
## 1 ARCID 0 9 Integer
## 2 UP_CELLS 2 24 Real
## 3 statefp 4 80 String
## 4 statens 4 80 String
## 5 affgeoid 4 80 String
## 6 geoid 4 80 String
## 7 stusps 4 80 String
## 8 name 4 80 String
## 9 lsad 4 80 String
## 10 aland 2 24 Real
## 11 awater 2 24 Real
## 12 state_name 4 80 String
## 13 state_abbr 4 80 String
## 14 jurisdicti 4 80 String
## 15 statefp.1 4 80 String
## 16 statens.1 4 80 String
## 17 affgeoid.1 4 80 String
## 18 geoid.1 4 80 String
## 19 stusps.1 4 80 String
## 20 name.1 4 80 String
## 21 lsad.1 4 80 String
## 22 aland.1 2 24 Real
## 23 awater.1 2 24 Real
## 24 state_na_1 4 80 String
## 25 state_ab_1 4 80 String
## 26 jurisdic_1 4 80 String
# then read in the shapefile
rivers_sp<- readOGR(dsn = "./data", layer = "rivs_CA_OR_hydroshed") # takes a few seconds!## OGR data source with driver: ESRI Shapefile
## Source: "/Users/ryanpeek/Documents/github/teaching/mapping-in-R-workshop/data", layer: "rivs_CA_OR_hydroshed"
## with 21976 features
## It has 26 fields
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
sfHere’s how to do the same thing using the sf package. Notice two important differences, the sf package reads things in as a regular dataframe, with the spatial component of the data contained inside a geometry list-column within the dataframe. That means you can operate on this data as you would any data frame. The other main difference, is that reading shape data in is much faster with sf.
# notice the simple structure, but results in dataframe
rivers_sf <- st_read("data/rivs_CA_OR_hydroshed.shp") # fast!## Reading layer `rivs_CA_OR_hydroshed' from data source `/Users/ryanpeek/Documents/github/teaching/mapping-in-R-workshop/data/rivs_CA_OR_hydroshed.shp' using driver `ESRI Shapefile'
## Simple feature collection with 21976 features and 26 fields
## geometry type: MULTILINESTRING
## dimension: XY
## bbox: xmin: -124.5429 ymin: 32.58542 xmax: -114.1315 ymax: 46.26042
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
## Coordinate Reference System:
## EPSG: 4326
## proj4string: "+proj=longlat +datum=WGS84 +no_defs"
No different here, process is the same. But let’s take a look at a package that might be helpful for folks working with state/county boundaries.
A nice package for pulling county/state data is the USAboundaries package. Importantly, this package pulls these data in as sf features (dataframes), not as rgdal or SpatialPolygonDataFrames data.
Let’s show this in two steps, the first is how to grab a sf feature for a given state or states.
library(USAboundaries)
# Pick a State
state_names <- c("california")
# Download STATE data and add projection
CA<-us_states(resolution = "high", states = state_names)
st_crs(CA)## Coordinate Reference System:
## EPSG: 4326
## proj4string: "+proj=longlat +datum=WGS84 +no_defs"
That was easy…what about counties? We can use the same type of call, but let’s add some dplyr and purrr functionality here to add the X and Y values for the centroid of each polygon (county) we download. In this case we use map_dbl because it will take a vector or values (the geometry col here), map a function over each row in that vector, and return a vector of values (the centroid points).
library(purrr)
# Pick some CA counties
co_names <- c("Butte", "Placer", "El Dorado", "Nevada", "Yuba", "Sierra", "Plumas")
# get COUNTY data for a given state
counties_spec <- us_counties(resolution = "low", states=state_names) %>% # use list of state(s) here
filter(name %in% co_names) %>% # filter to just the counties we want
# add centroid values for labels using the geometry column
mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
lat=map_dbl(geometry, ~st_centroid(.x)[[2]])) First let’s show a few examples comparing how to plot the rivers dataset using sf.

Okay, let’s add in the county/state data, and figure out a few tricks to making our map a bit cleaner.
plot and sf# note adding some alpha adjustment here and using "cex" to make larger
plot(counties_spec$geometry, col=adjustcolor("maroon", alpha=0.5), cex=1.5, axes=TRUE, main="Selected CA Counties")
# now add rivers to existing plot, with add=TRUE
plot(rivers_sf$geometry, col=adjustcolor("blue", alpha=0.7), add=TRUE)
# now add some labels using the centroid lat/long we added earlier
text(counties_spec$lon, counties_spec$lat, labels = counties_spec$name)
Unforunately this map isn’t very good. How can we improve it? Seems it would be nicer to crop the river layer to the counties of interest, or at least center things on that area. Also might be nice to have the labels on top and not obscured by the rivers.
# get range of lat/longs from counties for mapping
mapRange1 <- st_bbox(counties_spec) # view bounding box
# plot rivers
plot(rivers_sf$geometry, col=adjustcolor("blue", alpha=0.7),
xlim=mapRange1[c(1,3)], ylim = mapRange1[c(2,4)], axes=TRUE,
main="Selected CA Counties and Rivers")
# add counties
plot(counties_spec$geometry, col=adjustcolor("maroon", alpha=0.5), cex=1.5, add=TRUE)
# add labels for counties
text(counties_spec$lon, counties_spec$lat, labels = counties_spec$name,
col=adjustcolor("white", alpha=0.8), font = 2)
That’s a little better…but still not quite what we want. Lots of extra ink going toward stuff that we don’t need (i.e., rivers outside of counties). What about cropping the river layer to our county layer?
sfGreat thing is that sf has some really nice tools for this, just as any GIS program would. Here let’s use st_intersection to crop our river layer to only rivers in the counties of interest.
## although coordinates are longitude/latitude, st_intersection assumes that they are planar
## Warning: attribute variables are assumed to be spatially constant
## throughout all geometries

# buffer a single county? Warning is ok as well...has to do with lat/lon
county_buff <- st_buffer(counties_spec[counties_spec$name=="El Dorado",], dist = .05) # note this is a buffer of decimal degrees## Warning in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle =
## endCapStyle, : st_buffer does not correctly buffer longitude/latitude data
## dist is assumed to be in decimal degrees (arc_degrees).
# now plot
plot(counties_spec$geometry, axes=TRUE)
plot(county_buff$geometry, border="maroon", col=adjustcolor("maroon", alpha=0.5), add=TRUE)
Great, let’s try our plot again.
# try again, let's switch layer ordering
plot(counties_spec$geometry, col=adjustcolor("maroon", alpha=0.2), cex=1.5, axes=TRUE, main="Selected CA Counties")
plot(rivers_crop$geometry, col=adjustcolor("blue", alpha=0.7),add=TRUE)
plot(CA$geometry, add=TRUE, lwd=2)
text(counties_spec$lon, counties_spec$lat, labels = counties_spec$name,
col="maroon", font = 2)
Ok! Not bad. What can we improve?
ggplot2 and sfWe can use ggplot2 to plot our sf data! Many more options using this platform for sf. Mapping with ggplot2 brings some extra things we can fiddle with. Since these data are all data.frames (sf features), we can use the geom_sf function.
One important note…there’s a common error that you may get semi-regularly, which will look something like this:
Error in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, :
polygon edge not found
Don’t worry about this, it has to do with the grid package…it’s annoying but all you need to do is replot your map/figure (sometimes several times) and eventually the plot will work/show up.

# not cropped
ggplot() +
geom_sf(data=CA, color = "gray30", lwd=2, fill=NA) + # California border
geom_sf(data=counties_spec, fill = NA, show.legend = FALSE, color="gray50", lwd=0.4) + # counties
geom_label_repel(data=counties_spec, aes(x=lon, y=lat, label=name)) +
theme_bw()
Great, this is a nice example…but what if we want to crop to the area of interest like we did previously with the st_bbox? Well, we can add a call to the coord_sf() to limit the range we’re interested in. Let’s also add an “annotation” to our map to delineate the CA state line.
# with cropped range (to only our selected counties)
ggplot() +
geom_sf(data=CA, color = "gray30", lwd=2, fill=NA) +
geom_sf(data=counties_spec, fill = NA, show.legend = F, color="gray50", lwd=0.4) +
geom_sf(data=rivers_crop, col="blue", alpha=0.8, size=0.5)+
geom_label_repel(data=counties_spec, aes(x=lon, y=lat, label=name)) +
coord_sf(xlim = mapRange1[c(1,3)], ylim = mapRange1[c(2,4)]) +
theme_bw(base_family = "Helvetica") + # change to "sans" if this font not available
labs(title="Selected CA Counties and Riverlines")+
theme(panel.grid.major = element_line(color = "gray80", linetype = 1)) + # change grid
annotate("text", x=c(-120.1,-119.9), y=40.3, label=c("CA","NV"), color="gray40", size=3, fontface=2, angle = 90)
ggmap BackgroundAnother option is to use the ggmap package…which unforunately also requires you register for an API key…which is a bit longer of a process than I have time to go into. Suffice to say, follow the steps outlined on the ggmap repository.
library(ggmap)
location=c(-121,39.5) # set the center of the map
# set the background map up
map1 <- get_map(location=location, crop = F,
color="bw",
maptype="terrain",
source="google",
zoom=8)
ggmap(map1) +
geom_sf(data=CA, color = "gray30", lwd=2, fill=NA, inherit.aes = F) +
geom_sf(data=counties_spec, fill = NA, show.legend = F, color="gray50", lwd=0.4, inherit.aes = F) +
geom_sf(data=rivers_crop, col="blue", alpha=0.8, size=0.5, inherit.aes = F)+
geom_label_repel(data=counties_spec, aes(x=lon, y=lat, label=name)) +
coord_sf(xlim = mapRange1[c(1:2)], ylim = mapRange1[c(3:4)]) +
theme_bw(base_family = "Roboto Condensed") +
labs(title="Selected CA Counties and Riverlines")
We can add our sf data directly into a leaflet map using the mapview package.